LangChain Community: VectorStores: Azure Cosmos DB Mongo vCore with DiskANN #27329

Merged
merged 18 commits into from
Dec 12, 2024

Conversation

fatmelon
Contributor

@fatmelon fatmelon commented Oct 14, 2024

Description

Add a new vector index type, diskann, to the Azure Cosmos DB Mongo vCore vector store. The DiskANN algorithm is described in the paper "DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node".

Sample Usage

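import os

from langchain_community.vectorstores.azure_cosmos_db import (
    AzureCosmosDBVectorSearch,
    CosmosDBSimilarityType,
    CosmosDBVectorSearchType,
)
from pymongo import MongoClient

# CONNECTION_STRING, docs, and openai_embeddings are assumed to be defined elsewhere.
INDEX_NAME = "izzy-test-index-2"
NAMESPACE = "izzy_test_db.izzy_test_collection"
DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")

client: MongoClient = MongoClient(CONNECTION_STRING)
collection = client[DB_NAME][COLLECTION_NAME]

# Embedding deployment and model names (the construction of openai_embeddings is not shown).
model_deployment = os.getenv(
    "OPENAI_EMBEDDINGS_DEPLOYMENT", "smart-agent-embedding-ada"
)
model_name = os.getenv("OPENAI_EMBEDDINGS_MODEL_NAME", "text-embedding-ada-002")

# Embed the documents and store them in the collection.
vectorstore = AzureCosmosDBVectorSearch.from_documents(
    docs,
    openai_embeddings,
    collection=collection,
    index_name=INDEX_NAME,
)

# Read more about these variables in detail here. https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search
max_degree = 40
dimensions = 1536  # must match the embedding size of the model above
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_DISKANN
l_build = 20

# Create the DiskANN vector index on the collection.
vectorstore.create_index(
    dimensions=dimensions,
    similarity=similarity_algorithm,
    kind=kind,
    max_degree=max_degree,
    l_build=l_build,
)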
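For reviewers who want to see what the call above is expected to translate into, here is a minimal sketch of the index definition document, assuming the createIndexes format described on the Microsoft Learn page linked in the sample. The helper name build_diskann_index_command is hypothetical, the default embedding_key of "vectorContent" is an assumption, and the exact document built inside this PR may differ.

```python
from typing import Any, Dict


def build_diskann_index_command(
    collection_name: str,
    index_name: str,
    dimensions: int,
    similarity: str = "COS",
    max_degree: int = 40,
    l_build: int = 20,
    embedding_key: str = "vectorContent",  # assumed field holding the vectors
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a createIndexes command for a DiskANN index."""
    return {
        "createIndexes": collection_name,
        "indexes": [
            {
                "name": index_name,
                # Mark the embedding field as a cosmosSearch (vector) index key.
                "key": {embedding_key: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": "vector-diskann",
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "maxDegree": max_degree,
                    "lBuild": l_build,
                },
            }
        ],
    }


# Same values as the sample usage above.
command = build_diskann_index_command(
    collection_name="izzy_test_collection",
    index_name="izzy-test-index-2",
    dimensions=1536,
)
```

Against a live cluster, a document like this could be sent with pymongo via client[DB_NAME].command(command); the sketch itself only builds the dictionary and needs no connection.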

Dependencies

No additional dependencies were added.

@fatmelon fatmelon marked this pull request as ready for review October 14, 2024 06:44
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. community Related to langchain-community Ɑ: vector store Related to vector store module labels Oct 14, 2024
@fatmelon
Contributor Author

@baskaryan could you help review this PR? We have some users waiting for this feature, thanks :)

@fatmelon
Contributor Author

@efriis could you help review this PR? We have some users waiting for this feature, thanks :)

@fatmelon
Contributor Author

@eyurtsev could you help review this PR? We have some users waiting for this feature, thanks :)

@fatmelon
Contributor Author

Hi @ccurme, @vbarda, @hwchase17, could anyone review this PR? We have some users waiting for this feature, thanks :)

@fatmelon
Contributor Author

fatmelon commented Nov 4, 2024

@isahers1 could you review this PR? I @-mentioned the recommended reviewers, but none of them have looked at it yet.

@fatmelon
Contributor Author

@baskaryan, @efriis, @eyurtsev, @ccurme, @vbarda, @hwchase17, could you help review this PR? We have some users waiting for this feature, thanks :)

@aayush3011
Contributor

Friendly ping @baskaryan for a review here.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 12, 2024
@efriis efriis enabled auto-merge (squash) December 12, 2024 01:45
@efriis
Member

efriis commented Dec 12, 2024

Also, if you're interested in maintaining this integration without us in the loop, we'd love to get an integration package out! Future PRs against langchain would just be {docs updates, as well as registering your package in libs/packages.yml, deprecating this community integration in favor of your integration package}

Here's the guide, and if you have questions, feel free to leave them in the comments on those pages so others can see them! https://python.langchain.com/docs/contributing/how_to/integrations/

@efriis
Member

efriis commented Dec 12, 2024

You should reach out to @marlenezw for help getting set up in the https://github.com/langchain-ai/langchain-azure repo if you want to host out of there!

@efriis efriis self-assigned this Dec 12, 2024
@efriis efriis merged commit d1e0ec7 into langchain-ai:master Dec 12, 2024
21 checks passed